人脸图像通常以广泛的视觉量表出现。现有的面部表示通过组装有限系列的预定尺度的多尺度方案来追求处理量表变化的带宽。这种多弹药方案带来了推理负担,而预定义的量表不可避免地从真实数据中差异。取而代之的是,从数据中学习比例参数,并将其用于单发功能推理是一个不错的解决方案。为此,我们通过诉诸规模空间理论并实现两倍的设施来改革Conv层:1)Conv层从真实数据分布中学习一组尺度,每个数据分布都由Conv内核来实现; 2)该图层自动在适当的通道和位置上突出显示与输入模式量表及其存在相对应的位置。然后,我们通过堆叠改革层的层来实现分层尺度的关注,建立一种名为“比例尺注意Cons Neurnet网络”(\ textbf {scan-cnn})的新颖风格。我们将扫描CNN应用于面部识别任务,并推动SOTA性能的前沿。当面部图像模糊时,准确性增长更为明显。同时,作为单发方案,该推断比多弹性融合更有效。与普通CNN相比,制造了一组工具,以确保对扫描CNN进行快速训练和推理成本的零增加。
translated by 谷歌翻译
嵌入式模型是高维数据的有效学习范例。但是,嵌入模型的一个开放问题是它们的表示(潜在因子)通常会导致大参数空间。我们观察到,现有的分布式训练框架面临嵌入模型的可伸缩性问题,因为从服务器的共享嵌入参数更新和检索共享嵌入参数通常占主导地位培训周期。在本文中,我们提出了一种新的系统框架,可显着提高巨大嵌入模型培训的可扩展性。我们拥抱嵌入的嵌入式作为绩效机会的倾斜流行分布,并利用它来解决具有嵌入缓存的通信瓶颈。为确保缓存跨越一致性,我们将新的一致性模型纳入HET设计,该模型提供了在每嵌入的基础上提供细粒度的一致性保证。与以前的工作相比,只允许读取操作的僵化,HET也利用了写入操作的血液性。六种代表性任务的评估表明,在最先进的基线上,HET达到高达88%的嵌入通信减少和高达20.68倍的性能加速。
translated by 谷歌翻译
面部反欺骗(FAS)在防止演示攻击中的人脸识别系统中起着至关重要的作用。由于身份和微不足道的方差不足,现有面部反欺骗数据集缺乏多样性,这限制了FAS模型的泛化能力。在本文中,我们提出了双重欺骗解散生成(DSDG)框架,通过“通过生成反欺骗”来解决这一挑战。根据变形AutiaceDer(VAE)中的可解释分解潜在解剖学,DSDG学习身份表示的联合分布和潜在空间中的欺骗模式表示。然后,可以从随机噪声生成大规模成对的实时和欺骗图像,以提高训练集的分集。然而,由于VAE的固有缺陷,一些产生的面部图像被部分地扭曲。这种嘈杂的样本很难预测精确的深度值,因此可能阻碍广泛使用的深度监督优化。为了解决这个问题,我们进一步引入了轻量级深度不确定性模块(DUM),减轻了噪声样本对深度不确定性学习的不利影响。 DUM在没有依赖性的情况下开发,因此可以灵活地集成与任何深度监督网络进行面部反欺骗。我们评估了提出的方法在五个流行基准上的有效性,并在测试中实现了最先进的结果。该代码可在https://github.com/jdai-cv/facex-zoo/tree/main/addition_module/dsdg中获得。
translated by 谷歌翻译
人脸识别是计算机视觉中最受欢迎和最长的主题之一。随着最近的深度学习技术和大规模数据集的发展,深刻的面貌识别取得了显着的进展,并广泛用于许多现实世界应用。给定自然图像或视频帧作为输入,端到端的深脸识别系统输出面部特征以识别。为此,典型的端到端系统通常具有三个关键元素:面部检测,面部对准和面部表示。面部检测定位在图像或框架中的面部。然后,继续进行面对准以将面部校准到规范视图并将它们裁剪到归一化的像素尺寸。最后,在面部表示的阶段,从对准的面部提取歧视特征以进行识别。如今,所有三个要素都是通过深度卷积神经网络的技术实现的。在本调查文章中,我们对最近的端到端深刻识别的每个元素的进步进行了全面的审查,因为蓬勃发展学习技巧极大地提高了它们的能力。首先,我们概述了端到端深表识的概述。然后,我们分别审查每个元素的前进,涵盖许多方面,例如迄今的算法设计,评估指标,数据集,性能比较,对未来研究的有希望和有希望的方向。通过这项调查,我们希望在两个方面提出贡献:首先,读者可以方便地识别子类别中具有相当强大的基础风格的方法,以进一步探索;其次,人们还可以采用合适的方法来从划痕建立最先进的端到端面部识别系统。
translated by 谷歌翻译
The development of social media user stance detection and bot detection methods rely heavily on large-scale and high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, suppressing graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built based on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain and user tweet features as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when introducing multiple relations. By analyzing experiment results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
translated by 谷歌翻译
Image Virtual try-on aims at replacing the cloth on a personal image with a garment image (in-shop clothes), which has attracted increasing attention from the multimedia and computer vision communities. Prior methods successfully preserve the character of clothing images, however, occlusion remains a pernicious effect for realistic virtual try-on. In this work, we first present a comprehensive analysis of the occlusions and categorize them into two aspects: i) Inherent-Occlusion: the ghost of the former cloth still exists in the try-on image; ii) Acquired-Occlusion: the target cloth warps to the unreasonable body part. Based on the in-depth analysis, we find that the occlusions can be simulated by a novel semantically-guided mixup module, which can generate semantic-specific occluded images that work together with the try-on images to facilitate training a de-occlusion try-on (DOC-VTON) framework. Specifically, DOC-VTON first conducts a sharpened semantic parsing on the try-on person. Aided by semantics guidance and pose prior, various complexities of texture are selectively blending with human parts in a copy-and-paste manner. Then, the Generative Module (GM) is utilized to take charge of synthesizing the final try-on image and learning to de-occlusion jointly. In comparison to the state-of-the-art methods, DOC-VTON achieves better perceptual quality by reducing occlusion effects.
translated by 谷歌翻译
Dynamic treatment regimes assign personalized treatments to patients sequentially over time based on their baseline information and time-varying covariates. In mobile health applications, these covariates are typically collected at different frequencies over a long time horizon. In this paper, we propose a deep spectral Q-learning algorithm, which integrates principal component analysis (PCA) with deep Q-learning to handle the mixed frequency data. In theory, we prove that the mean return under the estimated optimal policy converges to that under the optimal one and establish its rate of convergence. The usefulness of our proposal is further illustrated via simulations and an application to a diabetes dataset.
translated by 谷歌翻译
As natural language processing (NLP) for gender bias becomes a significant interdisciplinary topic, the prevalent data-driven techniques such as large-scale language models suffer from data inadequacy and biased corpus, especially for languages with insufficient resources such as Chinese. To this end, we propose a Chinese cOrpus foR Gender bIas Probing and Mitigation CORGI-PM, which contains 32.9k sentences with high-quality labels derived by following an annotation scheme specifically developed for gender bias in the Chinese context. Moreover, we address three challenges for automatic textual gender bias mitigation, which requires the models to detect, classify, and mitigate textual gender bias. We also conduct experiments with state-of-the-art language models to provide baselines. To our best knowledge, CORGI-PM is the first sentence-level Chinese corpus for gender bias probing and mitigation.
translated by 谷歌翻译
Off-policy evaluation (OPE) is a method for estimating the return of a target policy using some pre-collected observational data generated by a potentially different behavior policy. In some cases, there may be unmeasured variables that can confound the action-reward or action-next-state relationships, rendering many existing OPE approaches ineffective. This paper develops an instrumental variable (IV)-based method for consistent OPE in confounded Markov decision processes (MDPs). Similar to single-stage decision making, we show that IV enables us to correctly identify the target policy's value in infinite horizon settings as well. Furthermore, we propose an efficient and robust value estimator and illustrate its effectiveness through extensive simulations and analysis of real data from a world-leading short-video platform.
translated by 谷歌翻译
Off-Policy evaluation (OPE) is concerned with evaluating a new target policy using offline data generated by a potentially different behavior policy. It is critical in a number of sequential decision making problems ranging from healthcare to technology industries. Most of the work in existing literature is focused on evaluating the mean outcome of a given policy, and ignores the variability of the outcome. However, in a variety of applications, criteria other than the mean may be more sensible. For example, when the reward distribution is skewed and asymmetric, quantile-based metrics are often preferred for their robustness. In this paper, we propose a doubly-robust inference procedure for quantile OPE in sequential decision making and study its asymptotic properties. In particular, we propose utilizing state-of-the-art deep conditional generative learning methods to handle parameter-dependent nuisance function estimation. We demonstrate the advantages of this proposed estimator through both simulations and a real-world dataset from a short-video platform. In particular, we find that our proposed estimator outperforms classical OPE estimators for the mean in settings with heavy-tailed reward distributions.
translated by 谷歌翻译